Accelerating External Sorting via On-the-fly Data Merge in Active SSDs

Authors

  • Young-Sik Lee
  • Luis Cavazos Quero
  • Youngjae Lee
  • Jin-Soo Kim
  • Seung Ryoul Maeng
Abstract

The concept of active SSDs (solid state drives) has been introduced in order to cope with the demands required to process the ever-increasing volumes of data. In active SSDs, some of the data-processing tasks are offloaded to SSDs, freeing host system resources and improving overall performance of data analysis. In this paper, we propose a novel active SSD architecture focused on improving the external sorting algorithm that is used extensively in data-intensive computing. By performing merge operations on-the-fly in active SSDs, our method can remove the extra data transfer and enhance the lifetime of SSDs. Our evaluation results on a real SSD platform indicate that the proposed scheme outperforms the traditional external sorting by up to 39%.
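The idea of merging on the fly can be illustrated with a small host-side sketch. This is not the paper's in-SSD implementation: it only mimics the principle in ordinary Python, where sorted runs are written once and the merged order is produced lazily while the data is read back, so no merged file is ever written. All function names (`write_runs`, `read_run`, `merge_on_read`, `external_sort`) and the one-integer-per-line run format are assumptions made for this illustration.

```python
import heapq
import os
import tempfile
from itertools import islice


def write_runs(values, run_size, directory):
    """Phase 1: split the input into sorted runs, one temporary file per run."""
    paths = []
    it = iter(values)
    while True:
        # Sort at most run_size items in memory (stand-in for the memory budget).
        chunk = sorted(islice(it, run_size))
        if not chunk:
            break
        fd, path = tempfile.mkstemp(dir=directory, suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in chunk)
        paths.append(path)
    return paths


def read_run(path):
    """Stream one sorted run back from storage, one integer per line."""
    with open(path) as f:
        for line in f:
            yield int(line)


def merge_on_read(paths):
    """Phase 2: lazily k-way merge the runs as the consumer reads them.

    Nothing is written back to storage; the merged order exists only
    in the stream handed to the caller.
    """
    return heapq.merge(*(read_run(p) for p in paths))


def external_sort(values, run_size=4):
    """Sort integers using run generation followed by a merge on read."""
    with tempfile.TemporaryDirectory() as d:
        paths = write_runs(values, run_size, d)
        # Consume the merged stream before the run files are deleted.
        return list(merge_on_read(paths))
```

A call such as `external_sort([5, 3, 9, 1, 7, 2, 8, 6, 4, 0], run_size=4)` generates three runs and merges them while reading. A conventional external sort would instead write the merged result to storage and read it again later, which is the extra transfer and extra write that the abstract says the proposed scheme avoids.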

Similar resources

External Sorting on Flash Memory Via Natural Page Run Generation

The increasing popularity of flash memory means more database systems will run on flash memory in the future. One of the most important database operations is the external sort. Hence, this paper is focused on studying the problem of efficient external sorting on flash memory. In contrast to most previous work, we target the situation where previously sorted data has become progressively un-sor...

BigSparse: High-performance external graph analytics

We present BigSparse, a fully external graph analytics system that picks up where semi-external systems like FlashGraph and X-Stream, which only store vertex data in memory, left off. BigSparse stores both edge and vertex data in an array of SSDs and avoids random updates to the vertex data, by first logging the vertex updates and then sorting the log to sequentialize accesses to the SSDs. This...

PatTrieSort - External String Sorting based on Patricia Tries

External merge sort belongs to the most efficient and widely used algorithms to sort big data: As much data as fits inside is sorted in main memory and afterwards swapped to external storage as so called initial run. After sorting all the data in this way block-wise, the initial runs are merged in a merging phase in order to retrieve the final sorted run containing the completely sorted origina...

Sorting in Parallel Database Systems

Sorting in database processing is frequently required through the use of Order By and Distinct clauses in SQL. Sorting is also widely known in computer science community at large. Sorting in general covers internal and external sorting. Past published work has extensively focused on external sorting on uni-processors (serial external sorting), and internal sorting on multiprocessors (parallel i...

Parallel database sorting

Sorting in database processing is frequently required through the use of Order By and Distinct clauses in SQL. Sorting is also widely known in computer science community at large. Sorting in general covers internal and external sorting. Past published work has extensively focused on external sorting on uni-processors (serial external sorting), and internal sorting on multi-processors (parallel ...

Publication date: 2014